Random planar graphs and the London street network 
In this paper we analyse the street network of London both in its primary and dual representation. 
To understand its properties, we consider three idealised models based on a grid, a static random 
planar graph and a growing random planar graph. Comparing the models and the street network, 
we find that the streets of London form a self-organising system whose growth is characterised by 
a strict interaction between the metrical and informational space. In particular, a principle of least 
effort appears to create a balance between the physical and the mental effort required to navigate 
the city. 



I. INTRODUCTION 

Urban growth has been widely analysed in the last 
century using ideas from social physics and urban eco- 
nomics [H- In fact cities, as natural phenomena, provide 
an iconic paradigm for the science of complexity, both 
with respect to their allometric scaling laws, which relate 
them to the celebrated Zipf's law for population ranks 
and for the complexity of their transport patterns that 
have been analysed both in the context of fractal geom- 
etry 0] and network theory 0, H, [|| . 

Graph theory provides a natural environment to study 
urban growth. As far back as 1736, Euler applied graph 
theory to solve an urban problem, the well-known 
Königsberg bridges problem [7], thus relating a metrical 
problem to a topological one. 

A graph G is a very simple object, i.e. an ensemble 
of V vertices representing objects and E edges representing 
the relations between the objects, G = {V, E}. With 
this level of abstraction, graphs have been applied in 
geographical studies in different ways, for instance to study 
the patterns of urban commuting [8], the spread of infectious 
diseases [9] and networks of the retail system [10]. 

If we assume that the vertices of a graph are the street 
intersections in a city and the ends of dead end roads 
(or cul-de-sacs), and that the edges are the street fragments 
connecting the intersections, we obtain a so-called street 
network. In particular we call this representation the primary 
representation of the street network, following the terminology 
in [11]. Such a street network is a strange network 
when compared to other social or biological networks [12], 
in the sense that it is embedded in Euclidean space 
and its edges do not cross each other. In graph theory, 
such a network is called a planar graph [13]. 

The study of planar graphs has not received much attention 
in physics for two main reasons. The first is that 
the planarity criterion is not easy to handle analytically. 
A lack of analytical results has therefore discouraged 
analysts from dealing with such graphs. The second 
is that planar graphs can appear trivial in both their 
topological and geometrical properties. Regarding the 
first issue, we believe that since planar graphs represent 
a class of important phenomena, simulations can be used 



to quantify the basic properties of such graphs. Regard- 
ing the second issue, we note that the current research 
in the field is limited to static planar graphs. In this pa- 
per, we introduce a new class of models of growing planar 
graphs that show more articulated properties than their 
static counterparts. 

Moreover in the study of street networks, there is con- 
siderable interest in the so-called dual representation, 
that is the representation in which the streets are vertices 
and two vertices are connected whenever the streets they 
represent intersect (3lj . This representation describes the 
information content of the street network [T3|, in the 
sense that it represents the way a person navigates the 
city. To understand this concept, we need to refer to 
our personal experience when we move from one place 
to another in the city. In such a case, we do not think 
of all the street segments we cross to go from one point 
to another, but only the roads we move on, that are the 
vertices of the dual representation. Hence, to cross a large 
city (like London), we only need a small amount of information, 
namely the names of the streets along the route (the vertices 
of the dual representation). 

This concept will become clearer later. For now it is 
important to mention that it has been observed that the 
distribution of the number of connections (the degree dis- 
tribution) of the vertices of the dual representation of 
street networks is often scale- free [lij. This observation 
relates the phenomenology of urban growth to a wide 
range of scale-free phenomena through network theory 
representation and allows us to think of the growth of a 
city using an informational approach. 

In this paper, we analyse the street network of London 
in its primary and dual representation. To contextualise 
the results, we first introduce a grid model to simulate 
a maximally ordered city and then two stochastic models, 
one static and one growing, to simulate a 
maximally random city. In the primary representation, we 
construct measures in the topological and metrical space, 
in the cycle space, and in the information space. In the 
dual representation, we generate measures in the topo- 
logical and information space. Notably we find that the 
structure of London streets tends to be a compromise be- 
tween a growing random city and a grid-like city in the 
sense that it is self-organised in a way that minimises 
both the physical and the informational effort required 
in navigating the city. 

The importance of this research resides first of all in the 
quality of the analysed data (see appendix A for details), 
then in the detailed analysis of static planar graphs, and 
lastly in the introduction and analysis of growing random 
planar graphs. 



A. The street network of London 

London began in 43 AD as a Roman settlement and has had comparatively 
uninterrupted urban growth ever since, making it the largest metropolis 
in Western Europe. Establishing the borders of a city is still a 
controversial topic [15] and hence, to build our network, we consider 
all the streets contained in a circle of radius 28.26 km, centred on 
the centroid of the borough called the City of London, where the first 
Roman settlement was located. This area contains some 95 percent of the 
population of the 33 boroughs that comprise the Greater London Authority, 
which is also bounded by the M25 orbital road. In this way, we obtain a 
network with V = 163878 intersections, the vertices, and E = 199931 
street segments, the edges (see the left panel of Fig. 1). The London 
street network (hereafter LN) is a weighted network where the weight 
$w_{ij}$ of the edge connecting vertex i to vertex j is defined by the 
length l of the street fragment it represents. A key measure for such a 
network is the degree $k_i$ of a vertex i, defined as the number of 
vertices that vertex i is connected to. The average degree for LN is 
$\langle k \rangle = 2E/V \simeq 2.44$, a very small value, close to 
that of a tree. This is due to the massive presence of dead end roads, 
as we can see from the right panel of Fig. 1. 
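As a quick check of the figures quoted above, the average degree follows directly from the vertex and edge counts. The snippet below is only an illustrative calculation; the variable names are ours, and the tree value 2(V-1)/V is included for comparison.

```python
# Vertex and edge counts of the London street network quoted in the text.
V = 163878
E = 199931

# Every edge contributes to the degree of two vertices, so <k> = 2E/V.
avg_degree = 2 * E / V

# A tree on V vertices has V - 1 edges, hence <k> = 2(V - 1)/V, just below 2.
tree_degree = 2 * (V - 1) / V

print(f"<k> for LN   = {avg_degree:.4f}")
print(f"<k> for tree = {tree_degree:.6f}")
```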




FIG. 1: Left panel: the London street network considered 
in this research. Right panel: a localised view of the same 
network. 

An important measure that we will use in the next section is the 
density distribution of the length l of the street segments, i.e. the 
weight distribution of LN, measured in metres. We show it in Fig. 2 
and we find that it is well fitted by the function: 

where the average length of an edge is 95.73 m. Eq. (1) is scale-free 
over a long range of distances, and the long-distance cut-off ensures 
that the variance of the distribution is finite. 

FIG. 2: Measure of the length distribution P(l) for the street 
network of London and for the GRPG (growing random pla- 
nar graph). 



B. The Erdos-Renyi Random Planar Graph 

We first introduce a random model for a static planar 
graph. This is the only kind of random planar graph 
considered in the literature as far as we know, and we follow 
the convention of calling it the Erdos-Renyi planar graph 
(hereafter ERPG) in [r|. 

To build an ERPG we start with a Poisson distribution 
of N points in a plane and we choose a distance r. To 
build the first segment, we randomly pick two points 
of the distribution that are at a distance less than r and 
we connect them. We then continue to randomly pick 
pairs of points P and Q in the given point distribution 
that are at a distance less than r. If the segment PQ does 
not intersect any other line of the graph, we add it to 
the graph. The process ends when we have added the desired 
number of edges E or when we reach the maximum 
allowed number of edges E < ^-V - o{V)\v\. 
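The construction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used for the paper: the points are drawn uniformly in the unit square with a fixed count (a stand-in for the Poisson point distribution), the function names `segments_cross` and `build_erpg` are ours, the `max_tries` cap is an added safeguard, and degenerate collinear configurations are ignored because they have probability zero for random points.

```python
import math
import random

def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a)."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)

def segments_cross(p, q, a, b):
    """True if segments pq and ab cross; segments sharing a vertex do not count."""
    if len({p, q, a, b}) < 4:
        return False
    return (_orient(p, q, a) != _orient(p, q, b)
            and _orient(a, b, p) != _orient(a, b, q))

def build_erpg(n_points, n_edges, r, seed=0, max_tries=100000):
    """Pick random pairs closer than r and keep those that preserve planarity."""
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(n_points)]
    edges = set()
    for _ in range(max_tries):
        if len(edges) >= n_edges:
            break
        i, j = rng.sample(range(n_points), 2)
        edge = (min(i, j), max(i, j))
        if edge in edges or math.dist(pts[i], pts[j]) >= r:
            continue
        if any(segments_cross(pts[i], pts[j], pts[a], pts[b]) for a, b in edges):
            continue  # the new segment would cross an existing one
        edges.add(edge)
    return pts, edges

pts, edges = build_erpg(n_points=60, n_edges=40, r=0.3, seed=1)
print(len(edges), "edges added")
```

Each candidate segment is tested against all existing edges, which is quadratic overall; a network of the size of LN would need a spatial index, which is omitted here for brevity.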

Here we generate a realisation of the ERPG model with 
the same characteristics as the LN, that is the same num- 
ber of vertices and edges, and a distribution of points in a 
disc with the same radius as the LN. To obtain the same 
average length for the links, we choose r = 300 m. A localised 
view of this realisation is shown in the left panel 
of Fig. 3, where we should note that this graph is not 
necessarily fully connected. In particular, the realisation 
we took as a study sample is made of 2072 disconnected 
components, the largest one composed of 146965 vertices. 

FIG. 3: Left panel: a localised view of a realisation of the 
ERPG. Right panel: a localised view of a realisation of the 
Grid Model with degree < k >= 2.44. 



C. The Growing Random Planar Graph 



The ERPG is a static model for a planar graph. Since cities are often 
growing systems that assume their shape over the centuries, we introduce 
a novel class of random planar graphs which we call growing random planar 
graphs (hereafter GRPG). We will show how the growth of this graph implies 
different emerging properties from the ERPG. 

To build a GRPG we start with a segment of length A embedded in the 
Euclidean plane. At each time step, we randomly pick one of the vertices 
of the graph. We draw from it a new segment of length l according to an 
isotropic distance distribution $f(l, \theta) = f(l)$, where f is a 
probability density function. If the new segment does not intersect any 
of the existing segments, then we add it to our graph. This process 
creates a planar tree with average degree 
$\langle k \rangle = 2E/V = 2(V-1)/V$. To obtain a planar graph that is 
not a tree and that has average degree $\langle k \rangle > 2$, every n 
time steps we randomly pick a vertex i from the existing graph. Next we 
consider the set of vertices in the graph that are within a radius $l_0$ 
from vertex i, where $l_0$ is randomly drawn from the distribution f(l), 
and that form a segment with vertex i that does not intersect any other 
segment of the graph. Then we randomly pick a vertex j from this set of 
vertices and we add the line ij to the graph. The process continues until 
we reach the desired number of edges or vertices. The average degree of 
the vertices is then completely determined by n, 
$\langle k \rangle = 2 + 2/n$, and thus the GRPG properties are completely 
determined by the choice of n and f(l). 

Here we analyse a realisation of a GRPG with the same number of vertices 
and edges as the LN, f(l) given by Eq. (1) (see Fig. 2) and n = 5. We show 
this realisation in Fig. 4, where the white dot shows where the first 
segment was located. We also notice how the power law distribution for 
the length of the edges allows long range connections, thus creating 
independent centres outside of the main cluster city, which leads to an 
overall asymmetric form. By changing the distribution f(l) it is possible 
to obtain different shapes of cities. Moreover, in the GRPG there are no 
unconnected components, unlike in the ERPG. 

FIG. 4: Left panel: the realisation of the GRPG considered 
here, where the white dot is the origin of the growth of the 
model. Right panel: a localised view of the same network. 



D. The Grid 

The last model we introduce is that of a regular grid 
(GM hereafter) to which we randomly add dead end 
roads to obtain the same average degree of the LN. We 
introduce this graph to simulate a maximally ordered 
city. 

We start with a square grid of n horizontal lines and n vertical lines. 
As in the previous networks, the vertices are defined by the intersections 
of the lines and the edges of length l by the lines connecting two 
intersections. In this way, we create $n^2$ vertices with degree 4. To 
create the same average degree as the LN, for m time-steps we add a new 
line in the following way. We randomly pick an edge from the network and 
from its midpoint we draw a new line of length $l/2 - a$, perpendicular 
to the selected edge, where $a = o(l)$. This process creates two new 
edges and two new vertices, of degree 1 and 3, at each time-step. The 
resulting network has 

$$V = n^2 + 2m \qquad (2)$$

vertices and it is easy to show that the average degree of the network is 
given by the following relation: 

$$\langle k \rangle = 4\,\frac{n^2 + m}{n^2 + 2m} \qquad (3)$$

Hence to find the correct values of n and m in building our grid model, 
it is sufficient to solve the system of equations (2) and (3) with the 
values of V and $\langle k \rangle$ taken from the LN, and we find 
$n^2 = 36053.2$ and m = 63912.4. Considering that we need integer numbers, 
we run a simulation with n = 190 and m = 63912, and this gives us the same 
average degree as LN. In the right panel of Fig. 3 
we show a localised realisation of such a graph. 
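The values of n² and m quoted above can be recovered by eliminating m between the vertex count and the average-degree relation, which gives n² = V(⟨k⟩ - 2)/2 and m = (V - n²)/2. The short calculation below, with a function name of our own choosing, reproduces them from V = 163878 and ⟨k⟩ = 2.44.

```python
def grid_parameters(V, k_avg):
    """Solve V = n^2 + 2m and <k> = 4(n^2 + m)/(n^2 + 2m) for n^2 and m."""
    n2 = V * (k_avg - 2.0) / 2.0
    m = (V - n2) / 2.0
    return n2, m

n2, m = grid_parameters(163878, 2.44)
k_check = 4 * (n2 + m) / (n2 + 2 * m)  # should give back 2.44
print(f"n^2 = {n2:.1f}, m = {m:.1f}, <k> = {k_check:.2f}")
```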

II. COMPARISON BETWEEN THE LONDON 
STREET NETWORK AND THE DIFFERENT 
MODELS IN THEIR PRIMARY 
REPRESENTATION 

In this section we compare the properties of the LN, 
the ERPG, the GRPG and the GM introduced in the last 
section. This section is divided into three subsections, in which 
we study separately the topological and geometrical properties, the 
measures in the cycle space, and the centrality measures. 
Many of the measures 
regarding the ERPG and the GM are trivial and are not 
considered. 



A. Topological and geometrical properties 

In a planar graph, topological and geometrical properties are very much 
interrelated. We begin by considering a geometrical feature, the spatial 
density of intersections ρ. The density of the intersections, or vertices, 
is an emergent property of the complex organisation of a growing planar 
graph. In the case of the ERPG it is Poisson, while in the case of the GM 
it is a uniform distribution. 

In the left panel of Fig. 5 we show the measure of the radial density ρ(r) 
of the intersections in LN compared to the one measured in the GRPG. In 
the case of LN, we see that ρ(r) has a plateau up to a radius of 
approximately 3.5 km, then the density drops fast until a radius of around 
7 km from the centre is reached. After that, the behaviour changes 
abruptly and ρ(r) decays linearly toward the periphery. In the case of the 
GRPG, we can see that the growth of the graph produces a density 
distribution that is a smooth, bell-shaped decaying function of the 
distance. The plateau seen in LN is missing and the function decays 
rapidly out to a radius of around 15 km, producing a random city whose 
extension is half that of its real counterpart. The linear decay of the 
density function for LN is related to the city's historical suburban 
growth and can be related to the phenomenon we call urban sprawl [18]. 

This behaviour can be better understood if we look at Fig. 6, where we 
show a representation of the shape of the density distribution for the LN 
(top panel) and the GRPG (central panel). For the LN, we can see that 
there is a large concentration of intersections in the centre, while the 
suburbs have a more homogeneous shape characterised by high peaks. For the 
GRPG, the overall shape does not have any large discontinuities. In both 
panels, we notice how the power law effect of the edge length distribution 
of Eq. (1) produces local inhomogeneous patterns in the form of isolated 
peaks. This effect is more evident for the LN. The reason is that London 
grew to incorporate pre-existing town centres. In the bottom panel of 
Fig. 6 we show the contour plot of the intersection density of the LN with 
the position of the town centres superimposed on it, noting how the 
density pattern is correlated with them. 

In the right panel of Fig. 5 we show the comparison of the average length 
of the road fragments l(r) as a function of the distance from the centre. 
In this case, we see that the GRPG model agrees very well with the real 
network for the first 15 km. The average increase in the lengths of the 
edges of the considered graphs is clear evidence of the growth of both 
systems, in which, on average, the centres of the graphs are filled with 
short edges and the periphery is sparser, leaving space for longer edges. 
The large fluctuations that are evident in the GRPG model for large values 
of r are due to finite size effects. 
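A radial density profile of the kind discussed in this subsection can be obtained by counting intersections in concentric annuli and dividing by the annulus area. The sketch below is only an illustration of that binning; the function name, the bin width `dr` and the toy coordinates are our own assumptions and do not come from the London data.

```python
import math

def radial_density(points, centre, dr, r_max):
    """Number of points per unit area in annuli of width dr around centre."""
    n_bins = int(math.ceil(r_max / dr))
    counts = [0] * n_bins
    for x, y in points:
        b = int(math.hypot(x - centre[0], y - centre[1]) // dr)
        if b < n_bins:
            counts[b] += 1
    # Area of the annulus between radii i*dr and (i+1)*dr.
    return [c / (math.pi * (((i + 1) * dr) ** 2 - (i * dr) ** 2))
            for i, c in enumerate(counts)]

# Toy example: two points in the first annulus, one in the second.
rho = radial_density([(0.5, 0.0), (0.0, 0.5), (1.5, 0.0)], (0.0, 0.0), dr=1.0, r_max=2.0)
print(rho)
```

The same binning, with edge midpoints in place of intersections and an average of edge lengths in place of a count, would give a profile such as l(r).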

FIG. 5: Left panel: the radial density ρ(r) of intersections for 
the LN and the GRPG. The tail of the measure for London 
is well fitted by a linear function (Adj. $R^2$ = 0.99029). Right 
panel: measures of the average edge length l(r) versus the 
distance from the centre for the LN and for the GRPG. 

Even if planar graphs in nature are not characterised by a high degree of 
connectivity of the vertices, the degree distributions of different planar 
graphs show non-trivial features. For topological aspects, our networks 
are completely specified by their weighted adjacency matrix 
$W = \{w_{ij}\}$, where $w_{ij} = l_{ij}$, for $0 < i, j \le V$, $l_{ij}$ 
being the length of the street segment connecting vertex i and vertex j, 
if vertex i and vertex j are connected, and $w_{ij} = 0$ otherwise. The 
degree $k_i$ of vertex i is defined as the number of connections of vertex 
i, $k_i = \sum_j \Theta(w_{ij})$, where Θ is the step function with 
Θ(0) = 0, and in this case it represents the number of streets 
intersecting at the given intersection. In the top left panel of Fig. 7 we 
show the degree distribution for the LN and the ERPG model using a linear 
scale. It is worth noting that vertices with degree two were suppressed in 
the construction of LN. We observe two peaked distributions with a maximum 
around the average degree, where it is possible to appreciate that the 
peak for the LN is much higher than the one for the ERPG model. Moreover 
the maximum degree for the LN is 8 while it is 12 for the ERPG. In the top 
right panel of the same figure, we observe the behaviour of the tail for 
the same distributions. They both appear to be ill-defined distributions, 
very similar to the ones found for ant galleries in [19], and to claim 
that they show exponential behaviour would be misleading. In the same 
panel, we show the degree distribution for the GRPG model. In this case, 
the distribution is not peaked and the exponential behaviour is clearly 
distinguished, with a maximum degree $k_{max} = 24$. This observation is 
very important. In fact, LN is a growing system and the fact that it does 
not display an exponential degree distribution relates to its particular 
organisation more than to its similarities to the ERPG. 

FIG. 6: Upper panel: the intersection density profile for the 
LN. Central panel: the same measure for the GRPG. Bottom 
panel: the density contours for the LN. The black circles show 
the current (2006) position of the main town centres. 

In weighted graphs, the strength of vertices often provides important 
information about the system and is strictly correlated to the degree of 
the vertices [20]. In our case, the strength $s_i$ of vertex i is defined 
as the sum of the lengths of the street fragments meeting at that vertex, 
$s_i = \sum_j w_{ij}$. In our three samples, the strength measures and 
their correlations are quite diverse. In the central panels of Fig. 7 we 
show the strength distribution for the LN, the ERPG and the GRPG. For LN 
(in the central left panel), the strength distribution shows a clear 
scale-free behaviour with exponent -3.87 ± 0.06. We find a similar 
behaviour in the GRPG (also in the central left panel) even if its 
scale-free behaviour is not well defined, while for the ERPG model (in the 
central right panel), the strength distribution is a peaked function with 
an exponential tail. 
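To make the definitions of degree and strength concrete, the sketch below computes both from a small weighted adjacency matrix. The matrix is an invented four-vertex example with edge lengths in metres, and the function name is ours.

```python
def degree_and_strength(W):
    """Degree k_i counts non-zero weights in row i; strength s_i sums them."""
    degree = [sum(1 for w in row if w > 0) for row in W]
    strength = [sum(row) for row in W]
    return degree, strength

# Toy symmetric matrix: w_ij is the segment length, or 0 if i and j are not connected.
W = [
    [0.0, 120.0, 80.0, 0.0],
    [120.0, 0.0, 0.0, 0.0],
    [80.0, 0.0, 0.0, 45.5],
    [0.0, 0.0, 45.5, 0.0],
]
k, s = degree_and_strength(W)
print(k)  # [2, 1, 2, 1]
print(s)  # [200.0, 120.0, 125.5, 45.5]
```

Averaging `s` over the vertices that share the same value of `k` gives the quantity plotted as the average strength as a function of degree.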

To understand the correlations between strength and degree of a vertex, 
in Fig. 7 we plot the average strength $\langle s(k) \rangle$ as a 
function of k. In the case of LN (in the bottom left panel), 
$\langle s(k) \rangle$ displays growing behaviour that can be fitted with 
an exponential curve within the error bars. In the bottom right panel, on 
a double logarithmic scale, we can observe how $\langle s(k) \rangle$ 
displays linear growth, $\langle s(k) \rangle = \langle l \rangle k$, for 
the ERPG, where $\langle l \rangle$ is the average length of the edges. 
For the GRPG, this shows super-linear growth, 
$\langle s(k) \rangle \propto k^{1.34}$, 
as observed in many other topological growing networks 

m. 

The last measure we show in this section is the average 
degree of the vertices as a function of the distance from 
the centre < k(r) >. This allows us to see how much the 
topological and metrical spaces are related. In Fig. 8 we 
show < k(r) > for LN and GRPG. In the case of ERPG 
and GM, < k(r) > is just a constant function of r. In 
the case of LN, < k(r) > decays linearly from the centre 
to the periphery. In the GRPG, < k(r) > decays more 
rapidly. This decay function is a signature of the growth 
of the system where the centre is more densely connected. 



B. Measures in the cycle space 

It is interesting to observe a planar graph in its cycle 
space, that is the space formed by all the edges of the 
graph that are part of a closed polygon [13]. In fact it is 
in that space that many of the planar graph properties 
are best understood. 

The length of a cycle Cl is defined as the number of its edges or 
vertices, which is an important number in understanding the geometry of 
the graph. In the GM, the cycle space is trivial. In the top panels of 
Fig. 9 we show the measures related to the cycle lengths in our networks. 
The top left panel shows the frequency distribution P(Cl) of the cycle 
lengths for LN, the ERPG and the GRPG. It is interesting to note that this 
distribution has a power law tail with a very similar slope for the three 
networks, with exponent -3. The significant differences between LN and the 
random graphs are that in LN, cycles of length 4 and 5 are more numerous 
than cycles of length 3, and that the tail for the LN is much longer than 
the tails of the random networks. This is probably due to the existence of 
geographical constraints, in that LN growth forces the creation of very 
large polygons (for instance around the Thames, as seen in the right panel 
of Fig. 1). In the top right panel of Fig. 9 we show the measure of the 
average cycle length < Cl(r) > as a function of the distance r from the 
centre for both the LN and the GRPG. The ERPG model is not included in the 
figure since in that case < Cl(r) > is a constant function of r. In both 
the LN and the GRPG model, < Cl(r) > is a growing function of r, which is 
characteristic of growing systems where central polygons are smaller and 
the average connectivity is larger. Nevertheless the growth of < Cl(r) > 
for LN is steadier and it is well fitted by a linear function. The decay 
of < Cl(r) > for large values of r is due to finite size effects. 

FIG. 7: Top left panel: the degree distribution P(k) for the LN and the 
ERPG. Top right panel: the degree distribution P(k) for the LN, the ERPG 
and the GRPG on a semi-logarithmic scale. The parameter of the exponential 
function fitting the distribution for the GRPG has a standard deviation 
σ = 0.02. Central left panel: the strength distribution P(s) for the LN 
and the GRPG on a double-logarithmic scale. Central right panel: the 
strength distribution P(s) for the ERPG on a semi-logarithmic scale. 
Bottom left panel: the average strength < s(k) > as a function of the 
degree k for the LN on a semi-logarithmic scale. Bottom right panel: the 
same function measured for the ERPG and the GRPG on a double-logarithmic 
scale. 

FIG. 8: The average degree < k(r) > as a function of the distance from the 
centre for LN and the GRPG. The LN data are well fitted by a linear 
function. 

The area of the faces A is also a measure used to char- 
acterise urban networks j^. In the bottom left panel 
of Fig. 9 we show the frequency distribution P(A) 
of the area A of the faces of LN, the ERPG and the 
GRPG. For the LN, we find a good agreement with the 
power law slope measured for the road network of Dresden [22]. 
Interestingly, we also find that the stochastic 
networks show a behaviour similar to that of London 
and Dresden, suggesting that the power law behaviour 
of the face area distribution is not likely to be a sign of 
complex self-organisation of an urban system, nor of its 
growth. 

In the bottom right panel of Fig. 9 we show the measure 
of the average area of the faces < A(r) > versus 
the distance from the centre r for the LN and the GRPG 
models. For the ERPG, < A(r) > is a constant function 
of r. As we expect, for LN and GRPG, the area of the 
faces is a growing function of r, supporting the hypothe- 
sis of a strong mono-centric component in the growth of 
the city. It is also interesting to note how the fluctuations 
grow with distance from the centre. 



C. Centrality measures 

The closeness centrality measures how close a vertex is to the traffic on 
the network, that is, how much of the network is easily reachable from it. 
It is defined as: 

$$C^c_i = \frac{V - 1}{\sum_{j \neq i} d_{ij}} \qquad (4)$$

where $d_{ij}$ is the sum of the lengths of the street segments forming 
the shortest path between vertex i and vertex j. We believe that the 
inverse of Eq. (4), $1/C^c_i$, measured in km, gives a better 
understanding of the dynamics of those networks, since it represents the 
average metric distance between the intersection i and all the other 
intersections of the graph. This gives an effective understanding of the 
physical effort (the informational effort will be considered in the next 
section) expended in navigating a city. In Fig. 10 we show the 
distribution $P(1/C^c)$ for each of our networks. We can see how the 
networks are highly differentiated by this measure. The ERPG is the least 
travel friendly, its vertices being more distant on average from all the 
other vertices, even if we only consider the connected portion. The 
majority of vertices lie on a plateau between 30 km and 46 km, the range 
of values over which these vertices are uniformly distributed. 

On a travel friendly scale, the ERPG has lower centrality than the GM. The 
distribution is similar, presenting a large plateau, but the plateau for 
the GM is higher and narrower, between 26 km and 37 km, and the tail falls 
exponentially over more than 10 km. Still considering the travel friendly 
scale, the centrality of LN lies between the static and growing models. 
For the LN, a plateau does not really exist. After peaking around 18 km, 
the distribution decreases with many fluctuations, but with an overall 
linear trend, to a maximum average distance of 40 km. It then appears that 
the most travel friendly pattern is that given by the GRPG, where we find 
a large peak around 13 km and a fast and smooth decay until 32 km. The 
extension of the GRPG city is smaller than in the other models, as we have 
already noted. What is interesting is the lack of a plateau for both the 
LN and the GRPG. 

FIG. 9: Top left panel: the frequency distribution P(Cl) for the length of 
the cycles Cl for the LN, the GRPG and the ERPG. The dashed line is a 
power law with slope -3. Top right panel: the measure of the average 
length of the cycles < Cl(r) > as a function of the distance r from the 
centre. Bottom left panel: the frequency distribution P(A) for the area of 
the faces A for the LN, the GRPG and the ERPG. The dashed line is a power 
law with slope -2. Bottom right panel: the measure of the average area of 
the faces < A(r) > as a function of the distance r from the centre. 

FIG. 10: Probability distribution for the inverse of the closeness 
centrality $P(1/C^c)$ for the LN, the ERPG, the GRPG and the GM. 


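The inverse closeness discussed in this subsection, i.e. the average shortest-path length from one intersection to all the others, can be computed with Dijkstra's algorithm on the weighted street graph. The following is a minimal sketch with hypothetical function names and a toy three-vertex graph; it averages over the vertices reachable from i, which corresponds to restricting the measure to a connected component.

```python
import heapq

def shortest_lengths(adj, source):
    """Dijkstra: metric length of the shortest path from source to every reachable vertex."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def mean_distance(adj, i):
    """Average shortest-path length from i to the other reachable vertices (inverse closeness)."""
    dist = shortest_lengths(adj, i)
    others = [d for v, d in dist.items() if v != i]
    return sum(others) / len(others)

# Toy street graph: a path 0 - 1 - 2 with segment lengths 1 km and 2 km.
adj = {0: [(1, 1.0)], 1: [(0, 1.0), (2, 2.0)], 2: [(1, 2.0)]}
print([mean_distance(adj, i) for i in adj])  # [2.0, 1.5, 2.5]
```

Running one Dijkstra search per vertex is expensive for a network with more than 10^5 vertices, so in practice the distribution could be estimated from a random sample of source vertices.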

III. THE DUAL REPRESENTATION AND THE 
ALIGNMENT PROBLEM 

The network of urban streets shows scale-free proper- 
ties for its degree distribution when it is considered in 
its dual representation, where the vertices are the streets 
and two vertices are connected if the streets they repre- 
sent intersect [ljj]. This is important for it allows us to 
look at the growth of cities in a novel way through an 
informative perspective. In this section, we examine in 
detail the properties of the dual street network of London 
(hereafter DLN) and we compare it to the properties of 
the dual representations of the three other models that 
we have introduced as our idealised baseline. 

FIG. 11: The process to create a dual graph from a street graph. Panel a: 
a fragment of the street graph of London, where every street segment has a 
different ID. Panel b: the same street network where the street segment 
IDs are changed after applying an alignment principle (the ICNP). Panel c: 
the final dual graph representation of the street network of Panel b, 
where the street IDs become the vertices and two vertices are connected if 
the streets they represent intersect. 

The procedure to build a dual street network is to assign the same label 
or ID to the street fragments that belong to the same road, using an 
alignment principle. The dual representation of a planar graph is then a 
network in which the roads are vertices and two vertices are connected if 
the roads they represent intersect. The procedure used to obtain a dual 
graph from a planar graph is shown in Fig. 11. In that construction, long 
roads with the same ID connect to many roads, while short roads, such as 
dead-ends, connect to just one or a very small number of other roads. In 
this way, hubs form at all scales, producing a characteristic shape of the 
degree distribution that is common to many self-organising systems 
0. 

An important issue is to find an algorithm to establish which street 
fragments belong to the same street, i.e. an algorithm to assign the IDs 
to the different street fragments. In [6], a street-name approach is 
considered, where two street segments are given the same ID if they have 
the same street name. Unfortunately, as noted in [4], this approach does 
not consider the fact that in many cities, many streets share the same 
street name without intersecting. It is also possible to find a single 
physical street that has two or more separate names. London is rich in 
both of these phenomena. In our view, the efficient approach called 
Intersection Continuity Negotiation (ICN) is worth considering [11]. This 
approach starts from the principle that two street fragments belong to the 
same road if the angle they form is close to 180 degrees. The ICN 
procedure ranks the pairs of street fragments at a given intersection by 
the convex angle they form, and the same ID is then given to the street 
fragments that form the largest convex angle. 

This approach is very efficient but, in our view, it 
fails to correctly describe the situation shown in Fig. 12. 
Fig. 12 shows how the ICN principle assigns the IDs to 
different streets at an unusual crossroad. Referring to 




FIG. 12: Panel a: a generic crossroad with random labels. 
Panel b: the same crossroad where IDs are reassigned by ICN 
principle. Panel c: the same crossroad where IDs are reas- 
signed by ICNP principle. 



the figure, it seems more plausible to use a negotiation 
principle that transforms street segment 3 into 1, as ICN 
does, leaving the other IDs unchanged (panel c of Fig. 12). 
This situation appears in reality when dealing with a ring 
road or a beltway where other roads enter or exit. 

To fix this problem, we have extended the ICN principle 
to the ICN Plus (ICNP) principle, in which the ICN is 
applied in all the cases in which the largest convex angle 
is formed between road segments that are not adjacent. In 
the case where two adjacent road segments form the largest 
convex angle, as in Fig. 12, we give them the same ID, but 
we do not change the IDs of the other street segments 
meeting at the vertex, considering them as different roads. 
A more precise description of the ICNP algorithm is given 
in Appendix B. 
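
The difference between the two negotiation principles at a single intersection can be illustrated with a minimal sketch. It assumes that each street segment leaving the intersection is described only by its bearing in degrees; the function names (`convex_angle`, `are_adjacent`, `pair_segments`) and the greedy pairing are illustrative choices, not the implementation used for the results in this paper.

```python
from itertools import combinations


def convex_angle(a, b):
    """Convex angle (0..180 degrees) between two bearings given in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def are_adjacent(i, j, bearings):
    """True if segments i and j are neighbours in the cyclic order around
    the intersection, i.e. one of the two arcs between them is empty."""
    a, b = bearings[i], bearings[j]
    arc = (b - a) % 360.0  # arc swept from a to b in the positive direction
    inside = [(bearings[k] - a) % 360.0 < arc
              for k in range(len(bearings)) if k not in (i, j)]
    return all(inside) or not any(inside)


def pair_segments(bearings, plus=True):
    """Return the index pairs that receive a common ID.

    plus=False: ICN, pairs are matched greedily by decreasing convex angle.
    plus=True:  ICNP, if the best pair is adjacent, only that pair is matched
                and all other segments are left as separate roads.
    """
    ranked = sorted(combinations(range(len(bearings)), 2),
                    key=lambda p: convex_angle(bearings[p[0]], bearings[p[1]]),
                    reverse=True)
    matched, used = [], set()
    for i, j in ranked:
        if i in used or j in used:
            continue
        matched.append((i, j))
        used.update((i, j))
        if plus and len(matched) == 1 and are_adjacent(i, j, bearings):
            break
    return matched


# A regular crossroad: both principles join the opposite segments.
print(pair_segments([0, 90, 180, 270]))
# A ring-road-like junction: two side roads enter on the same side.
print(pair_segments([0, 60, 120, 180], plus=False))  # ICN also joins the side roads
print(pair_segments([0, 60, 120, 180], plus=True))   # ICNP keeps them separate
```

In the second configuration the two segments forming the 180 degree angle are adjacent, so ICNP joins only them, which is the behaviour described for panel c of Fig. 12.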

The networks obtained in this way are unweighted and 
undirected. They are called information networks since 
they describe the way people think about moving in a 
city. To go from one point to another in a city, we do not 
need to know all the street segments and intersections 
that join the two points but only the name of the roads 
that enable us to navigate. For instance, considering the 
top panels in Fig. 11, if we want to travel from street 
segment 1 on the extreme left to the street segment 19, 
we would normally go straight along road 1 and then 
turn right onto road 19 as shown in panel b and not go 
straight along line 1, then taking line 5, line 11, line 12, 
line 13 and eventually turning right onto line 19 as shown 
in panel a. So we can say that to go from line 1 to line 
19, just one unit of information is required, as is clear 
from panel c of the same figure. Then the maximum 
information required to cross a city is the diameter of 
its information network, not the diameter of its primal 
network, where the diameter D of a network is defined 
as the length of the longest shortest path between any 
two vertices of that network. 
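
As an illustration of the construction, the following sketch builds an information network from a list of street segments that already carry their street IDs, and measures information distances by breadth-first search. The input format `(u, v, street_id)` and the toy network are assumptions made for the example; the diameter routine assumes a connected network.

```python
from collections import defaultdict, deque
from itertools import combinations


def dual_graph(segments):
    """Build the dual (information) network.

    segments: list of (u, v, street_id), one entry per street segment of the
    primal graph. Two street IDs are linked if they share a primal vertex.
    """
    streets_at = defaultdict(set)
    adj = {}
    for u, v, sid in segments:
        streets_at[u].add(sid)
        streets_at[v].add(sid)
        adj.setdefault(sid, set())
    for ids in streets_at.values():
        for a, b in combinations(ids, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def distances_from(adj, source):
    """Breadth-first search: number of changes of road from `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def diameter(adj):
    """Longest shortest path of a connected network."""
    return max(max(distances_from(adj, s).values()) for s in adj)


# Toy network: road 1 is made of three aligned segments, roads 5 and 19
# branch off it, and road 7 branches off road 5.
segments = [(0, 1, 1), (1, 2, 1), (2, 3, 1),
            (1, 4, 5), (3, 5, 19), (4, 6, 7)]
adj = dual_graph(segments)
print(distances_from(adj, 1)[19])  # one unit of information from road 1 to road 19
print(diameter(adj))               # longest information distance in the toy network
```

Although road 19 is three segments away from the start of road 1 in the primal graph, it is a single step away in the information network.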

It is worth noting that whatever algorithm is used to 
create the dual representation, it always contains bias. 
In our case, the longest road recognised has a length of 
around 17 km. The orbital M25, for example, is not 
recognised as a single road, nor are other important 
routes such as the A40, connecting the centre of London 
to Oxford. These biases are then reflected in the degree 
distribution, whose exponential tail is not well 
understood. Other solutions to the alignment problem are 
possible and research on the topic is active [23]. 



which represent the m lines added to the GM (see the 
introduction) to obtain the desired average degree. 



Dual representation 

          V        E         < k >    D      < C(k) > 
DLN       74782    107988    2.89     33     0.042 
DERPG     54458    91732     3.36     243    0.31 
DGRPG     67052    222374    6.73     72     0.44 
DGM       64296    100250    3.12     13 



IV. DUAL ANALYSIS 

In this last section, we will draw the discussion to a 
conclusion by analysing the properties of the dual 
representation of the LN, of the ERPG (hereafter DERPG), 
of the GRPG (hereafter DGRPG) and of the GM (hereafter 
DGM). These networks are purely topological, in the sense 
that they are not embedded in Euclidean space per se. We 
will split the section into two parts. In the first part we 
examine the main topological properties of those binary 
networks, such as degree distribution, clustering 
coefficient and nearest neighbour degrees, and in the 
second part we analyse the networks using an informational 
approach through centrality measures. 



TABLE I: Number of vertices V, number of edges E, average 
degree < k >, diameter D, and average clustering coefficient 
< C(k) > for the DLN, the DERPG, the DGRPG and the 
DGM. 







A. Topological properties 

In Tab. I we present the main topological properties 
for the dual representation of the considered networks. 
In the table, we show the number of vertices V, the 
number of edges E, the average degree < k >, the di- 
ameter D and the average clustering coefficient < C > 
for the four networks. We can already observe how differ- 
ent topologies in the primary representation give rise to 
very different dual networks. Remembering that in the 
dual representation, the number of vertices is the num- 
ber of different roads and that the number of edges is 
the number of intersections between different roads, we 
see that in the DLN there are a larger number of roads 
than those generated in the random networks. In spite 
of this, the diameter of the DLN is much smaller than 
the diameter of the random networks, and it is of the 
same order as the logarithm of the number of vertices, 
which is a small world property [24]. 
This means that even if random roads are longer than 
real ones, they are not organised to fill the space as effi- 
ciently as in the DLN. This effect is very much related to 
the angular distribution of the edges at the intersections 
of the primary graphs that generate the big differences in 
the average clustering coefficient to be explained below. 
In the case of the DGM, it has already been shown [4] 
that this is a bipartite graph in which one family of 
vertices represents the horizontal lines and the other 
family represents the vertical lines. Every vertex of one 
family is connected with all the vertices of the other 
family. From those vertices, small trees are generated 
(as in Fig. 13). 



FIG. 13: The dual representation of the Grid Model (left 
panel) is a bipartite graph (right panel) in which horizontal 
lines and vertical lines become different families of vertices. 

In the left panel of Fig. 14, we show the degree 
distribution P(k) and the cumulative degree distribution 
P(k* > k) for the dual network of London. The exponent 
of the best fitting line has been calculated for the 
cumulative distribution. From the degree distribution, 
we can see that a power law behaviour emerges with a 
fat tail. From the cumulative distribution, we can see 
how the tail of this distribution falls faster for large 
values of the degree. The same behaviour has been observed 
at national scales [25], where it can be attributed to the 
natural boundaries of the UK viewed as an island. We 
can say the same thing in this case, where a natural 
cut-off emerges for the finite sample size. For the tail, 
we also have to consider the above mentioned biases due to 
the choice of the alignment principle in the construction 
of the dual graph. 

In the right panel of Fig. 14, we show the degree 
distribution P(k) for the DERPG and the DGRPG on a 
semi-log scale. Notably, the maximum degree for the DERPG 
is kMax = 20, compared to kMax = 261 for the DLN 
and kMax = 229 for the DGRPG. We argue that the 
static planar graph has a structure that does not allow 
long "roads" to form, while the tree growing structure of 
the GRPG gives rise to "roads" with a length comparable 
to the ones of the LN. Moreover, it is interesting to note 
how the distribution of the random networks is radically 
different from the LN in terms of its information space. 
In the stochastic models, we observe an exponential 
behaviour for the degree distribution. In the case of the 
DGRPG, this exponential behaviour encapsulates a fat 
tail that appears at a maximum degree kMax = 229. We 
are tempted to speculate that this exponential behaviour 
relates to the lack of informational organisation of such 
random systems. 







FIG. 14: Left panel: degree distribution P(k) and cumulative 
degree distribution P(k* > k) for the DLN on a double-log 
scale. Right panel: degree distribution for the dual network 
of the DERPG model and the dual network of the DGRPG 
model on a semi-log scale. The fat tail of the latter has been 
cut in this plot, but it appears at kMax = 229. 

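The clustering coefficient or transitivity Ci for a vertex 
i is the ratio between the number of edges connecting 
the nearest neighbours of vertex i to each other and the 
number of such possible edges, and it is defined as: 

C_i = \frac{2 e_i}{k_i (k_i - 1)}   (5) 

where e_i is the number of edges between the nearest 
neighbours of vertex i and k_i is the degree of vertex i. 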

The average clustering coefficient < c(k) > then counts 
the number of triangles in the graph. In the top panels 
of Fig. 15, we show the average clustering coefficient 
< c(k) > as a function of the degree k measured in our 
networks. In the left panel, measures for the DLN and 
the DGRPG are shown. The average clustering coefficient 
for the DLN follows a power law with exponent 
-0.89 ± 0.01. This behaviour has already been noted 
in [11] for most of the 1 mile-square samples considered. 
The persistence of this effect at a larger scale makes it 
a characteristic signature of the dual representation of 
an urban network. This scaling behaviour is well explained 
by the very low average clustering coefficient 
< c > ≈ 0.04. This means that in the dual representation 
only a few triangles form and the larger the degree of a 
node, the lower the probability that its neighbours are 
interconnected. The poor triangular structure of the dual 
representation of urban street networks reflects the 
angular structure of the primary graph, where roads tend 
to be orthogonal and where cycles of length 4 or 5 are 
more likely to occur than cycles of length 3 (see top 
left panel of Fig. 9). In the same panel, the average 
clustering coefficient < c(k) > as a function of k is 
shown for the DGRPG. The values for < c(k) > are much 
larger than the ones found in the DLN, with an average 
clustering coefficient < c > ≈ 0.4, an order of magnitude 
larger than that for the DLN. That is due to the greater 
probability in the random network for triangles to form 
(noting that triangles in the primary space correspond to 
triangles in the dual space). The behaviour of < c(k) > 
is now less smooth and not well defined and its tail is 
much steeper than that in the DLN. 

In the top right panel of Fig. 15, we show the average 
clustering coefficient < c(k) > as a function of k for the 
DERPG. The average clustering coefficient is < c > ≈ 0.31, 
still an order of magnitude larger than the DLN, 
confirming the fact that there are more triangles in a 
random planar network than in an urban planar graph. 
Interestingly, the measured function decays 
exponentially with the degree. 
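
The clustering measures discussed above can be computed along the following lines. This is only a sketch on toy graphs, assuming an undirected network stored as a dictionary of neighbour sets without self-loops; the function names are illustrative.

```python
from collections import defaultdict


def local_clustering(adj):
    """Clustering coefficient of every vertex.

    adj: dict mapping each vertex to the set of its neighbours.
    Vertices with fewer than two neighbours get a coefficient of 0.
    """
    c = {}
    for i, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            c[i] = 0.0
            continue
        # Every edge between two neighbours of i is seen from both ends.
        links = sum(1 for a in nbrs for b in adj[a] if b in nbrs) // 2
        c[i] = 2.0 * links / (k * (k - 1))
    return c


def clustering_spectrum(adj):
    """Average clustering coefficient of the vertices of degree k, for each k."""
    c = local_clustering(adj)
    by_degree = defaultdict(list)
    for i, ci in c.items():
        by_degree[len(adj[i])].append(ci)
    return {k: sum(v) / len(v) for k, v in by_degree.items()}


# A triangle with one pendant vertex attached to vertex 0.
triangle = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(clustering_spectrum(triangle))

# A cycle of length 4: no triangles, so every coefficient vanishes.
square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
print(local_clustering(square))
```

The second example shows why a network dominated by cycles of length 4 has a very low clustering coefficient even though it is rich in cycles.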





FIG. 15: Top panels: average clustering coefficient < c(k) > 
versus the degree k measured in the DLN and the DGRPG 
(left panel) and for the DERPG (right panel). Bottom panels: 
average nearest neighbours degree as a function of the degree 
< knn(k) > measured in the DLN and the randomised DLN 
(left panel), the DERPG and the DGRPG (right panel). 

The average nearest neighbours degree as a function 
of the degree, < knn(k) >, quantifies the second order 
correlations of complex networks and is defined as: 

\langle k_{nn}(k_i) \rangle = \sum_{k_j} k_j P(k_j | k_i)   (6) 

where P(k_j | k_i) is the conditional probability that a 
vertex with degree k_i has a neighbour with degree k_j. We 
have to be careful in analysing the measures of < knn(k) >. 
In fact, it has been shown that such networks reveal 
structural correlations that are due to the degree 
distribution and to its cut off for large degrees [26]. 
Hence, for understanding the correlations of the system, 
it is important to compare the actual < knn(k) > with the 
one obtained in a randomised network. In the bottom left 
panel of Fig. 15, we show the measure of < knn(k) > as a 
function of the degree k for the DLN and the same measure 
for a network derived from the DLN by rewiring all the 
edges while keeping the degree sequence unchanged. In 
this way we can see that in the DLN there are 
disassortative correlations for small values of the 
degree, where small degree vertices tend to be connected 
with high degree ones, while for larger degrees the 
network looks uncorrelated. In the bottom right panel of 
Fig. 15, we show the same measure for the DERPG and the 
DGRPG. The former shows a structural disassortative 
behaviour while the latter shows a structural assortative 
behaviour, where high degree vertices tend to connect to 
high degree vertices. 
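
The comparison with a randomised network can be sketched as follows. The code assumes a simple undirected graph given as a list of edges; the randomisation uses repeated double edge swaps, a standard way of rewiring while preserving the degree sequence, which is an assumption here since the text does not specify the rewiring procedure that was used.

```python
import random
from collections import defaultdict


def knn_spectrum(edges):
    """Average nearest neighbours degree for each degree class k."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    nbr_sum = defaultdict(int)
    for u, v in edges:
        nbr_sum[u] += deg[v]
        nbr_sum[v] += deg[u]
    by_degree = defaultdict(list)
    for node, k in deg.items():
        by_degree[k].append(nbr_sum[node] / k)
    return {k: sum(v) / len(v) for k, v in by_degree.items()}


def rewire(edges, n_swaps, seed=0):
    """Randomise a simple graph by double edge swaps.

    Two edges (a, b) and (c, d) are replaced by (a, d) and (c, b) unless this
    would create a self-loop or a multiple edge, so every vertex keeps its
    degree.
    """
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4:
            continue
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


# A star is the extreme disassortative case: the hub only sees leaves.
print(knn_spectrum([(0, 1), (0, 2), (0, 3)]))
```

Comparing `knn_spectrum(edges)` with `knn_spectrum(rewire(edges, n_swaps))` separates the correlations imposed by the degree sequence from those specific to the network.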



B. Centrality measures 

The shortest path dij from vertex i to vertex j is 
defined as the number of edges that form the geodesic 
that connects vertex i to vertex j, and we have that 
1 ≤ dij ≤ D, where D is the diameter of the graph. In 
the top panels of Fig. 16, we show the frequency 
distributions P(d) for the shortest paths measured between 
each pair of vertices in our networks. This is a very 
important measure since it quantifies the informational 
content of the network, where dij represents the mental 
effort we incur in navigating a city. In this context, we 
can see how the distribution for the DLN lies between 
that of the DGM, the easiest "city" to navigate, and those 
of the random models, the most difficult "cities" in which 
to move. In the top left panel of Fig. 16, we show P(d) 
for the DLN and the DGM. In the case of the DLN, P(d) is 
well fitted by a Gaussian distribution centred at 
μc = 11.74 ± 0.05, with a width or variance of 
σ = 7.14 ± 0.09. μc represents the average information 
required to move from one point to another in the city. 

For the DGM the distribution P(d) is well fitted by 
a lognormal distribution centred at μc = 3.940 ± 0.006 
with width σ = 0.230 ± 0.001. That means that the 
average information needed to travel in a grid-like city 
is much less than that needed in a large city like 
London, as might have been expected. In the right panel 
of the same figure, we show the measures of P(d) for the 
DLN and the random networks. A semi-log scale is used to 
better resolve the tails of the distributions. For the 
random networks, we find that the distributions are 
shifted to the right with respect to the DLN, meaning 
that the information required to travel between two 
random vertices is larger than in the real network. 

P(d) for the DGRPG is still well fitted by a Gaussian 
distribution, even if we can see that the tail behaves 
slightly differently. The centre of the distribution is 
μc = 22.70 ± 0.02 and the width σ = 20.1 ± 0.3. The case 
of the DERPG is interesting too, since the measure is 
computed on the connected part of the network, which is 
smaller than the other networks under consideration. 
Still, the informational content of the network is smaller 
than those found in the other networks, in the sense that 
to navigate the ERPG, much more mental effort is needed. 
We can see from the figure how the tail of the 
distribution decays faster than the Gaussian curve. The 
centre of the distribution is at μc = 96.9 ± 0.3 and its 
width σ = 99.0 ± 0.7. We believe that the reason why the 
GRPG has more informational content than the ERPG is that 
the GRPG grows as a tree and this growth gives additional 
information content for navigation of the network. 

FIG. 16: Top panels: distribution P(d) for the average 
minimal path d between all the pairs of vertices of the 
network. Left panel: comparison between the DLN and the 
DGM. The DLN distribution is well fitted by a Gaussian 
distribution (reduced χ² = 6.8 × 10^-6), while the DGM by 
a lognormal distribution (reduced χ² = 6.1 × 10^-6). Right 
panel: comparison between the DLN, the DERPG and the 
DGRPG. A semi-log scale is applied to better resolve the 
tails and Gaussian fits are performed to clarify the 
deviations. Bottom panels: in the left panel the 
distribution P(C^B) for the betweenness centrality C^B for 
the DLN, the DERPG, the DGRPG and the DGM. In the right 
panel a particular view of the measured P(C^B) for the DGM 
fitted by a Gaussian distribution. 

The betweenness centrality for vertex v is defined as 

C^B_v = \frac{2}{(V-1)(V-2)} \sum_{i<j;\; i,j \neq v} \frac{g_{ivj}}{g_{ij}}   (7) 

where g_ij is the number of geodesics (shortest paths) 
connecting vertices i and j and g_ivj is the number of 
geodesics connecting vertices i and j that contain vertex 
v. C^B_v is 0 if v has degree one, that is if it represents a 
dead end road. The normalisation factor takes account 
of the fact that the maximum value for the betweenness 
centrality is achieved for the central vertex of a star 
graph [27]. Hence C^B is a measure of how probable it is 
to travel on a certain road when moving from one road to 
another in a city. The distribution function P(C^B) 
describes the hierarchy of betweenness centrality, if any 
exists. In the bottom left panel of Fig. 16, we show 
P(C^B) measured for our dual networks. Again we see that 
this measure provides a good classification for the 
different networks. In particular, we see that in the 
DLN, in the DERPG and in the DGRPG a scaling distribution 
emerges, implying a hierarchy in the centrality of the 
roads. For the DGM, we observe a scaling relation for low 
values of C^B related to the tree structures of the DGM 
(see Fig. 13). Then C^B is Gaussian distributed around a 
well defined average (see bottom right panel of Fig. 16) 
and this implies that in a grid, the information content 
of roads is nearly equivalent. The values assumed by C^B 
are related to the number of different equivalent 
geodesics g_ij that join different roads, where as g_ij 
increases, C^B falls. In this sense, we can understand the 
displacement of the distributions, the extremes being the 
DGM, where many equivalent geodesics exist between two 
roads, and the DERPG, where not many different choices 
exist in travelling from one point to another in the 
graph. In between, we find that the DLN and the DGRPG 
have similar behaviour. We thus believe that the tree 
growing structure of the DGRPG is very important in 
reproducing the hierarchy of the betweenness centrality 
associated with roads in the DLN. 
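
For readers who wish to reproduce this kind of measure, a minimal sketch of the normalised betweenness centrality is given below. It uses Brandes' accumulation scheme for unweighted graphs, which is an assumption about the method since the text does not state how the centrality was computed, and it expects a graph with more than two vertices stored as a dictionary of neighbour sets.

```python
from collections import deque


def betweenness(adj):
    """Betweenness centrality normalised so that the centre of a star gets 1."""
    n = len(adj)
    cb = dict.fromkeys(adj, 0.0)
    for s in adj:
        sigma = dict.fromkeys(adj, 0)   # number of geodesics from s
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}    # predecessors on geodesics from s
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate the pair dependencies from the farthest vertices.
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                cb[w] += delta[w]
    # Every unordered pair (i, j) has been counted twice, once from each end,
    # so dividing by (V-1)(V-2) applies the factor 2 / ((V-1)(V-2)).
    norm = (n - 1) * (n - 2)
    return {v: c / norm for v, c in cb.items()}


star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(betweenness(star))    # centre versus dead ends

square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
print(betweenness(square))  # two equivalent geodesics join opposite corners
```

In the 4-cycle each pair of opposite vertices is joined by two equivalent geodesics, so each intermediate vertex receives only half of that pair's contribution, which illustrates why the centrality falls as the number of equivalent geodesics grows.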



V. CONCLUSIONS 

A network theory approach to the study of planar 
graphs and urban networks is a natural consequence of 
the study of growing cities that fill their space in the man- 
ner of self-organising systems but it has not been widely 
explored to date. Indeed in this paper, we are the first 
to demonstrate how this can be useful for providing a 
description of urban growth. Many of the concepts that 
are crucial in urban planning, such as accessibility and 
density used in measuring urban sprawl, find natural 
definitions in the interplay between these primary and 
dual representations of urban systems [28]. 

In this paper, we have begun a deeper analysis of street 
networks for large cities, for which we develop both 
primal and dual representations. To contextualise these 
results, we considered three models for generating planar 
graphs based on a grid, a static planar graph and a 
growing planar graph. To our knowledge, this is the first 
time that a growing planar graph has been introduced for 
this kind of urban analysis, and we have illustrated that 
many geometrical and topological features of the LN are 
emergent properties of a growing system and that the GRPG 
is the best null model for understanding correlations and 
properties of the LN. 



In its primary representation, we found that the degree 
distribution of the LN is not a trivial outcome of the 
planarity criteria. It is quite different from the 
exponential degree distribution that we found for the 
GRPG, and this is clearly a result of more complex 
underlying organisation principles. Also in its primary 
representation, we have explored its topological and 
geometrical properties, and these measures in the cycle 
space confirm that the properties of planar graphs provide 
a richer texture for description and analysis than 
envisaged hitherto. 

In its dual representation, we have had the opportunity 
to observe how a real system is very different from a 
random one in terms of its information space. 
Interestingly, we found that while the degree distribution 
of the DLN is scale free, those of the random planar 
graphs are exponential. This means that the scale free 
distribution found in urban structure is a signature of a 
complex organisation within the information space, and the 
parallels with the behaviour of classical topological 
networks are thus straightforward [12]. 

In our view, the underlying principles of the organi- 
sation of the street network of large cities like London 
can be framed through a comparison of the centrality 
measures in their primary and dual representations. In 
fact, we found that while the GM is easy to navigate 
in terms of its information space, it is costly to navi- 
gate in metrical space, and while the GRPG is easier to 
navigate in its metrical space, it is difficult to navigate 
in its information space. Thus the LN appears to be a 
self-organised compromise between those two models, a 
system that balances the effort of spatial displacement 
against the amount of information needed to generate 
that displacement. Further developments of this research 
will pursue the applicability of this network model in 
developing descriptions and analyses of urban systems that 
reflect least effort principles. Moreover, a more detailed 
analysis of growing random planar graphs with different 
arc length distributions would help to understand more 
general properties of growing random planar graphs. 

APPENDIX A: THE LONDON STREET NETWORK 

The London street network was derived from two Ordnance 
Survey (OS) dataset products [29]: OS Meridian™ 2, which 
includes Motorways, A Roads, B Roads and Minor Roads, and 
the OS Integrated Transport Network (ITN). The latter 
includes all the above roads but in more detail, with a 
much greater number of minor roads. The reason two 
networks were used was that the ITN layer contains more 
detailed street geometry, such as traffic islands and 
roundabouts, and therefore more edges and vertices. Many 
of these were not needed for the analysis, as we were only 
interested in roads connecting to other roads, but this 
data provided the detail needed for construction of the 
full network. For example, each lane entering into a 
roundabout was represented as a separate vertex, while 
traffic islands have two edges and two vertices. To reduce 
the number of vertices and edges, roads in the ITN layer 
that were represented in the Meridian dataset were removed 
(through a buffering operation). This left only the minor 
roads that were not part of the Meridian network, which 
could then be snapped to the Meridian network. 



APPENDIX B: THE ICNP ALGORITHM 

We start with a planar graph G = {V, E} in which 
every edge has a different label or ID, represented by an 
integer number 1 ≤ ID ≤ E. We randomly pick an edge of 
the graph and, for each of its vertices, we consider all 
the edges intersecting at the vertex. Then we rank pairs 
of edges according to the convex angle they form. We next 
consider the pair of edges forming the largest convex 
angle and we relabel the edge with the larger ID, giving 
it the ID of the other edge. We repeat this operation for 
the remaining edges at the intersection. If the number of 
edges at the intersection is odd, then the last edge in 
the hierarchy of convex angles is not relabelled. If the 
edges forming the largest convex angle are adjacent, then 
we relabel them according to the above description, and 
we leave the IDs of all the other edges at the 
intersection unchanged. We repeat this process 
N ~ E^{3/2} times. 

It is worth noting that in graph theory the dual 
representation of a planar graph has a different meaning. 
In particular, for a planar graph G, the dual graph G* is 
the graph in which the faces of G are the vertices and two 
vertices are connected whenever the faces they represent 
share the same boundary in G. Nevertheless, in this paper 
we follow the definitions introduced in physics reviews. 
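
The relabelling loop of the ICNP algorithm can be sketched as follows. The sketch assumes that the angular ranking has already been carried out, so that for every vertex a list of matched edge pairs is available (the hypothetical `matches` argument); edge indices start at 0 while IDs start at 1, and the default number of iterations follows the N ~ E^{3/2} prescription.

```python
import math
import random


def icnp_relabel(endpoints, matches, n_iter=None, seed=0):
    """Propagate street IDs along matched edge pairs.

    endpoints: list of (u, v) vertex pairs, one per edge.
    matches:   dict mapping a vertex to the list of edge index pairs (i, j)
               that the angular ranking joins at that vertex.
    Returns the list of IDs, one per edge.
    """
    rng = random.Random(seed)
    n_edges = len(endpoints)
    label = list(range(1, n_edges + 1))      # every edge starts with its own ID
    if n_iter is None:
        n_iter = math.ceil(n_edges ** 1.5)
    for _ in range(n_iter):
        e = rng.randrange(n_edges)           # pick a random edge
        for vertex in endpoints[e]:          # visit both of its end vertices
            for i, j in matches.get(vertex, []):
                # The edge with the larger ID takes the ID of the other edge.
                label[i] = label[j] = min(label[i], label[j])
    return label


# A straight road a-b-c-d made of edges 0, 1, 2 and a side road b-x (edge 3).
endpoints = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "x")]
matches = {"b": [(0, 1)], "c": [(1, 2)]}
print(icnp_relabel(endpoints, matches, n_iter=200))
```

Because edges are picked at random, a short run may leave some segments of a long road with different IDs; the number of iterations controls how completely the IDs propagate.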



